DNA coding for a peptide of a papilloma virus major capsid protein and use thereof

ABSTRACT

This invention relates to a DNA encoding a peptide of a papilloma virus major capsid protein. Furthermore, this invention deals with a papilloma virus genome containing such a DNA. In addition, this invention concerns proteins encoded by the papilloma virus genome and virus-like particles as well as antibodies directed thereagainst and the use thereof in diagnosis, treatment and vaccination.

This is a Rule 371 application based on the priority date of PCT/EP95/01697, filed May 4, 1995.

I. FIELD OF THE INVENTION

The present invention relates to a DNA encoding a peptide of a papilloma virus major capsid protein. Furthermore, this invention concerns a papilloma virus genome containing such a DNA. In addition, this invention relates to proteins encoded by the papilloma virus genome and to virus-like particles as well as antibodies directed thereagainst and the use thereof in diagnosis, treatment and vaccination.

II. BACKGROUND OF THE INVENTION

It is well known that papilloma viruses infect the epithelial tissue of humans and animals. Human papilloma viruses (referred to as HP viruses below) are found in benign epithelial neoplasms, e.g., warts and condylomata in the genital region, and in malignant epithelial neoplasms, e.g., carcinomas of the skin and uterus. Zur Hausen, 1989, Cancer Research 49:4677-4681. HP viruses are also considered to be involved in the development of malignant tumors of the respiratory tract. Zur Hausen, 1976, Cancer Research 36:530. In addition, HP viruses are considered at least co-responsible for the development of squamous carcinomas of the lungs. Syrjanen, 1980, Lung 158:131-142.

Papilloma viruses have an icosahedral capsid without an envelope, which encloses a circular, double-stranded DNA molecule of about 7,900 bp. The capsid comprises a major capsid protein (L1) and a minor capsid protein (L2). Both proteins coexpressed, or L1 expressed alone, result in vitro in the formation of virus-like particles. Kirnbauer et al., 1993, Journal of Virology 67:6929-6936.

Papilloma viruses cannot be propagated in monolayer cell culture. Their characterization is therefore extremely difficult, and even the detection of papilloma viruses creates considerable problems. This applies particularly to papilloma viruses in carcinomas of the skin. Reliable detection thereof, and thus targeted steps thereagainst, have not been possible so far.

Therefore, it is the object of the present invention to provide an agent serving for detecting papilloma viruses, particularly in carcinomas of the skin. Furthermore, an agent is to be provided which serves for taking therapeutic steps against these papilloma viruses.

According to the invention, this is achieved by the provision of the subject matters in the claims.

III. SUMMARY OF THE INVENTION

The present invention is directed to a DNA encoding a peptide of a papilloma virus major capsid protein.

The present invention is also directed to a papilloma virus genome containing such a DNA.

The present invention is further directed to proteins encoded by the papilloma virus genome and to virus-like particles as well as antibodies directed thereagainst and the use thereof in diagnosis, treatment and vaccination.

IV. BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows the base sequence and the amino acid sequence, derived therefrom, of a DNA encoding a peptide of L1 of a papilloma virus (SEQ ID NO:1). This DNA was deposited as plasmid VS93-1 with the DSM (German Collection of Microorganisms and Cell Cultures) under DSM 9133 on Apr. 12, 1994.

FIG. 2 shows the base sequence and the amino acid sequence, derived therefrom, of a DNA encoding a peptide of L1 of a papilloma virus (SEQ ID NO:2). This DNA was deposited as plasmid CR148-59 with the DSM under DSM 9134 on Apr. 12, 1994.

FIG. 3 shows the base sequence and the amino acid sequence, derived therefrom, of a DNA encoding a peptide of L1 of a papilloma virus (SEQ ID NO:3). This DNA was deposited as plasmid VS40-7 with the DSM under DSM 9135 on Apr. 12, 1994.

FIG. 4 shows the base sequence and the amino acid sequence, derived therefrom, of a DNA encoding a peptide of L1 of a papilloma virus (SEQ ID NO:4). This DNA was deposited as plasmid VS20-4 with the DSM under DSM 9136 on Apr. 12, 1994.

FIG. 5 shows the base sequence and the amino acid sequence, derived therefrom, of a DNA encoding a peptide of L1 of a papilloma virus (SEQ ID NO:5). This DNA was deposited as plasmid VS102-4 with the DSM under DSM 9137 on Apr. 12, 1994.

FIG. 6 shows the base sequence and the amino acid sequence, derived therefrom, of a DNA encoding a peptide of L1 of a papilloma virus (SEQ ID NO:6). This DNA was deposited as plasmid VS73-1 with the DSM under DSM 9138 on Apr. 12, 1994.

FIG. 7 shows the base sequence and the amino acid sequence, derived therefrom, of a DNA encoding a peptide of L1 of a papilloma virus (SEQ ID NO:7). This DNA was deposited as plasmid VS42-1 with the DSM under DSM 9139 on Apr. 12, 1994.

FIG. 8 shows the base sequence and the amino acid sequence, derived therefrom, of a DNA encoding a peptide of L1 of a papilloma virus (SEQ ID NO:8). This DNA was deposited as plasmid VS92-1 with the DSM under DSM 9140 on Apr. 12, 1994.

FIG. 9 shows the base sequence and the amino acid sequence, derived therefrom, of a DNA encoding a peptide of L1 of a papilloma virus (SEQ ID NO:9). This DNA was deposited as plasmid VS75-3 with the DSM under DSM 9141 on Apr. 12, 1994.
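The amino acid sequences of the figures are derived from the base sequences by reading successive codons. As a minimal sketch, assuming the standard genetic code and the reading frame starting at base 1 (as indicated by the CDS feature of the sequence listing), the following Python function translates the first 48 bases of SEQ ID NO: 1. The names `translate`, `CODON_TABLE` and `SEQ1_START` are illustrative and are not part of the specification.

```python
# Standard genetic code; codons are ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string (whitespace ignored) into one-letter amino acids."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing incomplete codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 16 codons of SEQ ID NO: 1 as given in the sequence listing.
SEQ1_START = "GGT AGA GGA CAG CCA TTA GGC GTG GGG TTA AGT GGA CAC CCT CTG TAT"
print(translate(SEQ1_START))  # GRGQPLGVGLSGHPLY
```

The printed one-letter string corresponds to Gly Arg Gly Gln Pro Leu Gly Val Gly Leu Ser Gly His Pro Leu Tyr, i.e. residues 1 to 16 of SEQ ID NO: 1.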

V. DETAILED DESCRIPTION OF THE INVENTION

According to its objective, the subject matter of this invention relates to a DNA encoding a peptide of a papilloma virus major capsid protein (L1), the peptide comprising at least a portion of the amino acid sequence of FIG. 1, FIG. 2, FIG. 3, FIG. 4, FIG. 5, FIG. 6, FIG. 7, FIG. 8, or FIG. 9.

The expression "at least a portion of the amino acid sequence" refers to the fact that the amino acid sequence of the individual figures may also include a variation of one or more amino acids.

Another subject matter of the invention deals with a DNA encoding a peptide of a papilloma virus major capsid protein, the DNA comprising the base sequence, or at least a portion of the base sequence, of FIG. 1, FIG. 2, FIG. 3, FIG. 4, FIG. 5, FIG. 6, FIG. 7, FIG. 8, or FIG. 9.

The expression "at least a portion of the base sequence" refers to the fact that the base sequence of the individual figures may also include a variation of one or more base pairs.

The above DNA as described in the drawings has the following sequence homology with known papilloma viruses:

DNA of FIG. 1: 82.7% with HP virus 29

DNA of FIG. 2: 75% with HP virus 49

DNA of FIG. 3: 78.5% with HP virus 49

DNA of FIG. 4: 75.6% with HP virus 25

DNA of FIG. 5: 79% with HP virus 17

DNA of FIG. 6: 73.6% with HP virus 17

DNA of FIG. 7: 73.1% with HP virus 15

DNA of FIG. 8: 82.8% with HP virus 15

DNA of FIG. 9: 75.7% with HP virus 12.
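The specification does not state how the above homology values were calculated. As a minimal sketch, assuming that homology is taken as the percentage of identical positions between two sequences that are already aligned without gaps, such a value could be computed as follows. The function name `percent_identity` and the toy input are hypothetical; the HP virus reference sequences are not reproduced in this document, so the sketch does not recompute the listed values.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of identical positions in two pre-aligned, equal-length sequences."""
    a = "".join(seq_a.split()).upper()
    b = "".join(seq_b.split()).upper()
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


# Toy input (an assumption for illustration, not taken from the figures):
# 7 of 8 positions are identical.
print(percent_identity("ACGTACGT", "ACGTACGA"))  # 87.5
```

A real comparison would first require an alignment of the two sequences, which this sketch leaves out.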

According to the invention, the above DNA may exist in a vector or an expression vector. A person skilled in the art is familiar with examples thereof. In the case of an expression vector for E. coli, these are, e.g., pGEMEX, pUC derivatives and pGEX-2T. For expression in yeast, e.g., pY100 and Ycpad1 are to be mentioned, while for expression in animal cells, e.g., pKCR, pEF-BOS, cDM8 and pCEV4 are to be indicated. The person skilled in the art knows suitable cells to express the above DNA present in an expression vector. Examples of such cells comprise the E. coli strains HB101, DH1, x1776, JM101 and JM109, the yeast strain Saccharomyces cerevisiae and the animal cells L, 3T3, FM3A, CHO, COS, Vero and HeLa. The person skilled in the art knows in which way the above DNA has to be inserted in an expression vector. He is also familiar with the fact that the above DNA can be inserted in combination with a DNA encoding another protein or peptide, so that the above DNA can be expressed in the form of a fused protein.

Another subject matter of the invention relates to a papilloma virus genome which comprises the above DNA. The expression "papilloma virus genome" also comprises an incomplete genome, i.e. fragments of a papilloma virus genome, which comprise the above DNA. This may be, e.g., a DNA encoding L1 or a portion thereof.

For providing the above papilloma virus genome it is possible to use a method which comprises the following steps:

(a) isolating the total DNA from a biopsy of an epithelial neoplasm,

(b) hybridizing the total DNA of (a) with the above DNA thereby detecting a papilloma virus genome included in the total DNA of (a), and

(c) cloning the total DNA of (a), including the papilloma virus genome, in a vector and optionally subcloning the resulting clone, all steps originating from conventional DNA recombination technique.

As regards the isolation, hybridization and cloning of cell DNA, reference is made to Sambrook et al., Molecular Cloning, A Laboratory Manual, second edition, Cold Spring Harbor Laboratory, 1989, by way of supplement.
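Step (b) is a laboratory hybridization. Purely as an in-silico analogy, and under the assumption that a probe "hybridizes" where it matches either strand of a target with at most a given number of mismatches, the detection can be sketched as follows. The names `reverse_complement` and `probe_hits`, the mismatch threshold and the toy sequences are hypothetical and are not part of the described method.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def probe_hits(target: str, probe: str, max_mismatches: int = 0):
    """List (position, strand, mismatches) for every window matching the probe.

    Strand "+" means the probe sequence itself occurs in the target; strand "-"
    means its reverse complement occurs in the target.
    """
    target = target.upper()
    hits = []
    for strand, p in (("+", probe.upper()), ("-", reverse_complement(probe))):
        for i in range(len(target) - len(p) + 1):
            window = target[i:i + len(p)]
            mismatches = sum(1 for a, b in zip(window, p) if a != b)
            if mismatches <= max_mismatches:
                hits.append((i, strand, mismatches))
    return hits


# Toy sequences (assumptions for illustration only).
print(probe_hits("TTTACGGATTT", "ACGGA"))  # [(3, '+', 0)]
print(probe_hits("TTTACGGATTT", "TCCGT"))  # [(3, '-', 0)]
```

Actual hybridization depends on probe length, temperature and salt conditions, none of which this simple mismatch count models.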

The expression "epithelial neoplasm" comprises any neoplasms of the epithelial tissue in humans and animals. Examples of such neoplasms are warts, condylomata in the genital region and carcinomas of the skin. The latter are preferably used here to isolate the above papilloma virus genome.

The expression "vector" comprises any vectors suitable for cloning chromosomal or extrachromosomal DNA. Examples of such vectors are cosmids such as pWE15 and Super Cos1, and phages such as λ-phages, e.g., λZAP expression vector, λZAPII vector and λgt10 vector. λ-phages are preferred here. The above vectors are known and obtainable from the Stratagene company.

Papilloma virus genomes according to the invention may be integrated in chromosomal DNA or present in extrachromosomal form. The person skilled in the art is familiar with methods of clarifying this. He is also familiar with methods of finding out the optimum restriction enzymes for cloning the papilloma virus genomes. He will orient himself by genomes of known papilloma viruses. In particular, the person skilled in the art will take the above-mentioned HP viruses into account accordingly.

The provision of a papilloma virus genome referred to as VS93-1-G is described by way of example. For this purpose, the total DNA is isolated from a biopsy of a squamous-epithelial carcinoma, cleaved by BamHI and separated electrophoretically in an agarose gel. Then, the agarose gel is subjected to a blotting method whereby the DNA is transferred to a nitrocellulose membrane. The membrane is used in a hybridization method in which the DNA of FIG. 1 is employed, optionally in combination with a DNA of HP virus 29, as labeled sample. Hybridization with the papilloma virus DNA existing in the total DNA is obtained.

Furthermore, the above total DNA cleaved by BamHI is cloned in a λ-phage. The corresponding clones, i.e. the clones containing the papilloma virus DNA, are identified by hybridization with the DNA of FIG. 1, optionally in combination with a DNA of HP virus 29. The insert of these clones is then subjected to another cloning in a plasmid vector so as to obtain a clone which contains the papilloma virus genome VS93-1-G. The genome is confirmed by sequencing.

Further papilloma virus genomes are provided in analogous manner. Corresponding to the DNAs used for their provision, they are referred to as: CR148-59-G, VS40-7-G, VS20-4-G, VS102-4-G, VS73-1-G, VS42-1-G, VS92-1-G and VS75-3-G, respectively.

Another subject matter of the invention relates to a protein which is encoded by the above papilloma virus genome. Such a protein is, e.g., a major capsid protein (L1) or a minor capsid protein (L2). Such a protein is produced as usual. The production of L1 and L2, respectively, of the papilloma virus genome VS93-1-G is described by way of example. For this purpose, the HP virus 29 related to the DNA of FIG. 1 is used. The complete sequence thereof and the position of individual DNA regions encoding proteins are known. These DNAs are identified on the papilloma virus genome VS93-1-G by parallel restriction cleavages of both genomes and subsequent hybridization with various fragments relating to the DNA encoding L1 and L2, respectively. They are confirmed by sequencing. The DNA encoding L1 is referred to as VS93-1-G-L1-DNA and the DNA encoding L2 is referred to as VS93-1-G-L2-DNA.

Furthermore, the DNA encoding L1 and L2, respectively, is inserted in an expression vector. Examples thereof for E. coli, yeast and animal cells are mentioned above. In this connection, reference is made to the vector pGEX-2T as regards the expression in E. coli by way of supplement. Kirnbauer et al., supra. After inserting the VS93-1-G-L1-DNA and VS93-1-G-L2-DNA, there are obtained pGEX-2T-VS93-1-G-L1 and pGEX-2T-VS93-1-G-L2, respectively. After the transformation of E. coli, these expression vectors express a glutathione S-transferase-L1-fused protein and glutathione S-transferase-L2-fused protein, respectively. These proteins are purified as usual.

For another expression of the above DNA encoding L1 and L2, respectively, there are mentioned the baculovirus system and the vaccinia virus system. Expression vectors usable for this purpose are, e.g., pEV mod. and pSynwtVI for the baculovirus system. Kirnbauer et al., supra. For the vaccinia virus system, particularly vectors including the vaccinia virus "early" (p7.5k) and "late" (Psynth, p11K) promoters are to be mentioned. Hagensee et al., 1993, Journal of Virology, 67:315-322. The baculovirus system is preferred here. Having inserted the above DNA encoding L1 and L2, respectively, in pEV mod., there are obtained pEVmod.-VS93-1-G-L1 and pEVmod.-VS93-1-G-L2, respectively.

The former expression vector alone or both expression vectors together lead to the formation of virus-like particles after the infection of SF-9 insect cells. In the former case, such a particle comprises an L1 protein, whereas in the latter case it contains L2 protein in addition to an L1 protein.

A virus-like particle of the latter case is also obtained in that the above VS93-1-G-L1-DNA and VS93-1-G-L2-DNA are together inserted in the expression vector pSynwtVI and the resulting pSynwtVI-VS93-1-G-L1/L2 is used for infecting SF-9 insect cells. The above virus-like particles are purified as usual. They also represent a subject matter of this invention.

Another subject matter of the invention concerns an antibody directed against an above-mentioned protein or virus-like particle. Such an antibody is produced as usual. This is described by way of example for the production of an antibody which is directed against a virus-like particle comprising L1 of VS93-1-G. For this purpose, the virus-like particle is injected subcutaneously into BALB/c mice. This injection is repeated at intervals of 3 weeks each. About 2 weeks after the last injection, the serum containing the antibody is isolated and tested as usual.

In a preferred embodiment, the antibody is a monoclonal antibody. For the production thereof, splenocytes are taken from the mice after the above fourth injection and fused with myeloma cells as usual. Further cloning is also carried out according to known methods.

The present invention makes it possible to detect papilloma viruses, particularly in carcinomas of the skin. For this purpose, the DNA according to the invention can be used as such or included in another DNA. The latter may also be a papilloma virus genome or a portion thereof.

Furthermore, the present invention makes it possible to provide formerly unknown papilloma viruses. They are found particularly in carcinomas of the skin. In addition, this invention supplies proteins and virus-like particles which originate from these papilloma viruses. Besides, antibodies are provided which are directed against these proteins and particles, respectively.

Thus, the present invention makes it possible to take diagnostic and therapeutic steps in the case of papilloma virus diseases. Moreover, it furnishes the possibility of preparing a vaccine against papilloma virus infections. Therefore, the present invention represents a breakthrough in the field of papilloma virus research.

The invention is explained by the examples. The following preparations and examples are given to enable those skilled in the art to more clearly understand and to practice the present invention. The present invention is not limited in scope by the exemplified embodiments, which are intended as illustrations of single aspects of the invention only, and methods which are functionally equivalent are within the scope of the invention. Indeed, various modifications of the invention in addition to those described herein will become apparent to those skilled in the art from the foregoing description and accompanying drawings. Such modifications are intended to fall within the scope of the appended claims.

VI. EXAMPLES

A. Example 1: Identification of papilloma virus genome VS93-1-G

The total DNA is isolated from the biopsy WV-8495 of a squamous-epithelial carcinoma of an immunosuppressed person. 10 μg of this DNA are cleaved by the restriction enzyme BamHI and separated electrophoretically in a 0.5% agarose gel. At the same time, 10 μg of the above DNA which has not been cleaved are also separated. The result of the electrophoresis is shown in FIG. 10. It follows therefrom that a DNA molecule exists in the non-cleaved DNA in a form typical of an extrachromosomal DNA, i.e. a "supercoiled molecule" and "open circular molecule", respectively. This DNA molecule is cleaved by BamHI into two fragments.
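The observation that BamHI cleaves the circular DNA molecule into two fragments implies two BamHI recognition sites on the circle. As a minimal sketch, assuming the known BamHI recognition sequence GGATCC with cleavage after the first G, the fragment lengths of a circular molecule can be computed as follows. The function name `circular_digest` and the 50 bp toy sequence are hypothetical; the sequence of the complete genome VS93-1-G is not given in this document.

```python
def circular_digest(seq: str, site: str = "GGATCC", cut_offset: int = 1) -> list[int]:
    """Return fragment lengths after cutting a circular molecule at every site.

    cut_offset is the number of bases of the site left of the cut
    (1 for BamHI, which cuts G^GATCC).
    """
    seq = "".join(seq.split()).upper()
    n = len(seq)
    # Append the first bases again so that a site spanning the origin is found.
    extended = seq + seq[: len(site) - 1]
    cuts = sorted(
        {(i + cut_offset) % n for i in range(n) if extended.startswith(site, i)}
    )
    if not cuts:
        return [n]  # no site: the circle stays intact
    # Distance from each cut to the next one around the circle;
    # a single cut linearizes the circle into one full-length fragment.
    return [(cuts[(k + 1) % len(cuts)] - c) % n or n for k, c in enumerate(cuts)]


# Toy circular sequence (an assumption for illustration) with two BamHI sites.
toy = "A" * 10 + "GGATCC" + "T" * 20 + "GGATCC" + "C" * 8
print(circular_digest(toy))  # [26, 24]
```

The fragment lengths always sum to the length of the circle, which is why two sites on a circular molecule yield exactly two fragments.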

The above agarose gel is subjected to a blotting method whereby the DNA is transferred from the agarose gel to a nitrocellulose membrane. The membrane is used in a hybridization method in which the above DNA of FIG. 1 is employed in combination with the HP virus 29 DNA as ³²P-labeled sample. Hybridization with the above DNA molecule is obtained.

The person skilled in the field of the DNA recombination technique is familiar with the above methods. Reference is made to Sambrook et al., supra, by way of supplement.

B. Example 2: Cloning of the papilloma virus genome VS93-1-G

The DNA of biopsy WV-8495 obtained in Example 1 is cleaved by the restriction enzyme BamHI. The resulting fragments are used in a ligase reaction in which the BamHI-cleaved and dephosphorylated vector ZAP Express is present.

The resulting recombinant DNA molecules are packaged in bacteriophages which are used for infecting bacteria. The ZAP expression vector kit offered by the Stratagene company is used for these steps. The resulting phage plaques are then subjected to a hybridization method in which the ³²P-labeled DNA of FIG. 1, used in Example 1, is employed in combination with the ³²P-labeled HP virus 29 DNA. Hybridization with corresponding phage plaques is obtained. The two BamHI fragments of VS93-1-G are isolated therefrom and used in another ligase reaction together with a BamHI-cleaved, dephosphorylated plasmid vector, i.e. pBluescript. The resulting recombinant DNA molecules are used for the transformation of bacteria, i.e. E. coli XL1-Blue. A bacterial clone containing the papilloma virus genome VS93-1-G is identified by restriction cleavages and hybridization with the above DNA samples, respectively. The plasmid of this bacterial clone is referred to as pBlue-VS93-1-G.

All references cited within the body of the instant specification are hereby incorporated by reference in their entirety.

    SEQUENCE LISTING

    (1) GENERAL INFORMATION:

       (iii) NUMBER OF SEQUENCES: 9

    (2) INFORMATION FOR SEQ ID NO: 1:

         (i) SEQUENCE CHARACTERISTICS:
              (A) LENGTH: 647 base pairs
              (B) TYPE: nucleic acid
              (C) STRANDEDNESS: double
              (D) TOPOLOGY: linear

        (ii) MOLECULE TYPE: DNA

        (ix) FEATURE:
              (A) NAME/KEY: CDS
              (B) LOCATION: 1..645

        (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

    GGT AGA GGA CAG CCA TTA GGC GTG GGG TTA AGT GGA CAC CCT CTG TAT   48
    Gly Arg Gly Gln Pro Leu Gly Val Gly Leu Ser Gly His Pro Leu Tyr
      1               5                  10                  15

    AAC AAA CTG AAT GAC ACT GAA AAC TCC AAC ATT GCA CAT GCT GAC AAT   96
    Asn Lys Leu Asn Asp Thr Glu Asn Ser Asn Ile Ala His Asn Asp Asn
                 20                  25                  30

    AGT CCT GAC TCC CGG GAC AAC ATT TCT GTT GAC TGT AAG CAA ACA CAA  144
    Ser Pro Asp Ser Arg Asp Asn Ile Ser Val Asp Cys Lys Gln Thr Gln
             35                  40                  45

    CTG TGC ATA CTG GGC TGT ACG CCC CCC ATG GGG GAA TAC TGG GGT AAG  192
    Leu Cys Ile Leu Gly Cys Thr Pro Pro Met Gly Glu Tyr Trp Gly Lys
         50                  55                  60

    GGT ACC CCT TGT GCA CGT ACT AAT ACT ACC CCA GGA GAC TGT CCT CCC  240
    Gly Thr Pro Cys Ala Arg Thr Asn Thr Thr Pro Gly Asp Cys Pro Pro
     65                  70                  75                  80

    TTG GAG TTA ATG ACA TCT TAT ATT CAG GAT GGC GAC ATG GTG GAT ACC  288
    Leu Glu Leu Met Thr Ser Tyr Ile Gln Asp Gly Gln Met Val Asp Thr
                     85                  90                  95

    GGG TAT GGT GCC ATG GAC TTT ACT GCC CTG CAA TTT AAT AAG TCT GAC  336
    Gly Tyr Gly Ala Met Asp Phe Thr Ala Leu Gln Phe Asn Lys Ser Asp
                100                 105                 110

    GTG CCC CTT GAT ATT TGC CAG TCT ATT TGC AAA TAT CCC GAT TAT TTG  384
    Val Pro Leu Asp Ile Cys Gln Ser Ile Cys Lys Tyr Pro Asp Tyr Leu
            115                 120                 125

    GGC ATG GCT GCC GAC CCG TAT GGC GAT AGC ATG TTC TTT TTC CTC CGT  432
    Gly Met Ala Ala Asp Pro Tyr Gly Asp Ser Met Phe Phe Phe Leu Arg
        130                 135                 140

    CGG GAA CAA CTG TTT GCC AGA CAC TTT TTC AAT CGT GCG GGT GAT GTT  480
    Arg Glu Gln Leu Phe Ala Arg His Phe Phe Asn Arg Ala Gly Asp Val
    145                 150                 155                 160

    GGA GAC AAA ATT CCA GAA TCT TTG TAC CTC AAA GGG AGT AGC GGG CGT  528
    Gly Asp Lys Ile Pro Glu Ser Leu Tyr Leu Lys Gly Ser Ser Gly Arg
                    165                 170                 175

    GAG ACT CCC GGC AGT GCT ATA TAC AGC CCC ACA CCC AGT GGG TCT ATG  576
    Glu Thr Pro Gly Ser Ala Ile Tyr Ser Pro Thr Pro Ser Gly Ser Met
                180                 185                 190

    GTG ACC TCT GAG GCA CAA ATA TTC AAT AAG TCT TAC TGG CTA CAG CAA  624
    Val Thr Ser Glu Ala Gln Ile Phe Asn Lys Ser Tyr Trp Leu Gln Gln
            195                 200                 205

    GCT CAA GGC CAA AAT AAC GGT AT                                   647
    Ala Gln Gly Gln Asn Asn Gly
        210                 215

    (2) INFORMATION FOR SEQ ID NO: 2:

         (i) SEQUENCE CHARACTERISTICS:
              (A) LENGTH: 668 base pairs
              (B) TYPE: nucleic acid
              (C) STRANDEDNESS: double
              (D) TOPOLOGY: linear

        (ii) MOLECULE TYPE: DNA

        (ix) FEATURE:
              (A) NAME/KEY: CDS
              (B) LOCATION: 1..666

        (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

    TCA AGA GGA CAC CCA TTA GGA GTA GGG TCT ACA GGT CAT CCC CTA TTT   48
    Ser Arg Gly His Pro Leu Gly Val Gly Ser Thr Gly His Pro Leu Phe
      1               5                  10                  15

    AAT AAA GTG AAG GAT ACG GAA AAT GCT AAT AAT TAT ATA GTA ACA TCT   96
    Asn Lys Val Lys Asp Thr Glu Asn Ala Asn Asn Tyr Ile Val Thr Ser
                 20                  25                  30

    AAG GAT GAT AGG CAG GAC ACC TCA TTT GAT CCT AAA CAG GTT CAA ATG  144
    Lys Asp Asp Arg Gln Asp Thr Ser Phe Asp Pro Lys Gln Val Gln Met
             35                  40                  45

    TTT ATT ATT GGC TGC GCA CCG TGC ATA GGT GAG CAC TGG GAT GCA GCC  192
    Phe Ile Ile Gly Cys Ala Pro Cys Ile Gly Glu His Trp Asp Ala Ala
         50                  55                  60

    AAG CCC TGT GAT GCT GAC AGA GGG GTA GGC AAA TGT CCA CCT TTG GAA  240
    Lys Pro Cys Asp Ala Asp Arg Gly Val Gly Lys Cys Pro Pro Lys Glu
     65                  70                  75                  80

    CTG GTA AAT ACT GTA ATA GAA GAT GGA GAT ATG GTG GAT ATA GGT TTT  288
    Leu Val Asn Thr Val Ile Glu Asp Gly Asp Met Val Asp Ile Gly Phe
                     85                  90                  95

    GGA AAT ATA AAT AAT AAA ACC CTG TCA GCA AAT AAG TCA GAT GTC AGT  336
    Gly Asn Ile Asn Asn Lys Thr Leu Ser Ala Asn Lys Ser Asp Val Ser
                100                 105                 110

    TTA GAT ATA GTT AAT AAT ATT TGT AAG TAT CCA GAC TTT TTA AAA ATG  384
    Leu Asp Ile Val Asn Asn Ile Cys Lys Tyr Pro Asp Phe Leu Lys Met
            115                 120                 125

    GCC AAT GAC ATA TAT GGA GAC TCC TGT TTT TTT TAT GCT AGA CGG GAG  432
    Ala Asn Asp Ile Tyr Gly Asp Ser Cys Phe Phe Tyr Ala Arg Arg Glu
        130                 135                 140

    CAA TGT TAT GCT AGA CAT TTT TTT GTT AGA GGA GGT AAT GTA GGA GAT  480
    Gln Cys Tyr Ala Arg His Phe Phe Val Arg Gly Gly Asn Val Gly Asp
    145                 150                 155                 160

    GCT ATT CCT GAT GCT GCA GTG GGT CAG GAC AAT AAC TTT GTG TTG CCT  528
    Ala Ile Pro Asp Ala Ala Val Gly Gln Asp Asn Asn Phe Val Leu Pro
                    165                 170                 175

    GCA GCT GTT GGA CAG GCC CAA AAC ACT TTG GGT AGC TCT ATT TAC GTG  576
    Ala Ala Val Gly Gln Ala Gln Asn Thr Leu Gly Ser Ser Ile Tyr Val
                180                 185                 190

    CCT ACC GTT AGT GGT TCT TTG GTA TCC ACA GAT GCA CAA TTA TTT AAT  624
    Pro Thr Val Ser Gly Ser Leu Val Ser Thr Asp Ala Gln Leu Phe Asn
            195                 200                 205

    AGG CCC TTT TGG CTA CAA CGA GCA CAG GGT CAT AAT AAC GGT AT       668
    Arg Pro Phe Trp Leu Gln Arg Ala Gln Gly His Asn Asn Gly
        210                 215                 220

    (2) INFORMATION FOR SEQ ID NO: 3:

         (i) SEQUENCE CHARACTERISTICS:
              (A) LENGTH: 661 base pairs
              (B) TYPE: nucleic acid
              (C) STRANDEDNESS: double
              (D) TOPOLOGY: linear

        (ii) MOLECULE TYPE: DNA

        (ix) FEATURE:
              (A) NAME/KEY: CDS
              (B) LOCATION: 1..660

        (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

    TCT AGG GGG CAA CCC TTG GGG GTA GGT TCT ACA GGC CAT CCT TTG TTC   48
    Ser Arg Gly Gln Pro Leu Gly Val Gly Ser Thr Gly His Pro Leu Phe
      1               5                  10                  15

    AAT AAA GTA AAG GAT ACT GAA AAT TCA AAT AAT TAT ATA ACA ATG TCT   96
    Asn Lys Val Lys Asp Thr Glu Asn Ser Asn Asn Tyr Ile Thr Met Ser
                 20                  25                  30

    AAA GAT GAT AGG CAG GAC ACC TCG TTT GAC CCT AAG CAG GTT CAA ATG  144
    Lys Asp Asp Arg Gln Asp Thr Ser Phe Asp Pro Lys Gln Val Gln Met
             35                  40                  45

    TTT ATT ATT GGC TGT GCA CCT TGT ATA GGG GAG CAC TGG GAT GCT GCC  192
    Phe Ile Ile Gly Cys Ala Pro Cys Ile Gly Glu His Trp Asp Ala Ala
         50                  55                  60

    AAA CCC TGT GAC GCT GAC AAA GGA GAC GGT AAA TGT CCA CCT TTA GAA  240
    Lys Pro Cys Asp Ala Asp Lys Gly Asp Gly Lys Cys Pro Pro Leu Glu
     65                  70                  75                  80

    TTA GTA AAT ACA GTT ATT GAG GAT GGG GAT ATG GTG GAT ATA GGT TTT  288
    Leu Val Asn Thr Val Ile Glu Asp Gly Asp Met Val Asp Ile Gly Phe
                     85                  90                  95

    GGT AAC ATA AAT AAT AAA ACC TTG TCA GCA AAT AAA TCA GAT GTC AGT  336
    Gly Asn Ile Asn Asn Lys Thr Leu Ser Ala Asn Lys Ser Asp Val Ser
                100                 105                 110

    TTG GAT ATA GTT AAT AAC ATT TGT AAG TAT CCA GAC TTC CTT AAA ATG  384
    Leu Asp Ile Val Asn Asn Ile Cys Lys Tyr Pro Asp Phe Leu Lys Met
            115                 120                 125

    GCC AAT GAC ATA TAT GGG GAC TCC TGT TTT TTT TAT GCC AGG CGG GAA  432
    Ala Asn Asp Ile Tyr Gly Asp Ser Cys Phe Phe Tyr Ala Arg Arg Glu
        130                 135                 140

    CAA TGT TAT GCT AGA CAC TTT TTT GTT AGG GGA GGC AAT GTA GGC GAT  480
    Gln Cys Tyr Ala Arg His Phe Phe Val Arg Gly Gly Asn Val Gly Asp
    145                 150                 155                 160

    CGA ATT CCT AAT GCT GCA GTG GGT CAG GAC AAT AAT TTT ATG TTA CCT  528
    Arg Ile Pro Asn Ala Ala Val Gly Gln Asp Asn Asn Phe Met Leu Pro
                    165                 170                 175

    GCA GCC GCT GGG CAG GCT CAA AAC ACT TTG GGC AAC TCT ATT TAT GTT  576
    Ala Ala Ala Gly Gln Ala Gln Asn Thr Leu Gly Asn Ser Ile Tyr Val
                180                 185                 190
 - - CCC ACG GTC AGT GGT TCT TTG GTG TCC ACA GA - #T GCT CAA TTA TTT AAC          624                                                                         Pro Thr Val Ser Gly Ser Leu Val Ser Thr As - #p Ala Gln Leu Phe Asn                    195          - #       200          - #       205                       - - AGG CCA TTT TGG CTG CAA CGA GCA CAA GGT CA - #C AAC A                    - #    661                                                                      Arg Pro Phe Trp Leu Gln Arg Ala Gln Gly Hi - #s Asn                                210              - #   215              - #   220                           - -  - - (2) INFORMATION FOR SEQ ID NO: 4:                                     - -      (i) SEQUENCE CHARACTERISTICS:                                                  (A) LENGTH: 677 base - #pairs                                                  (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: double                                                       (D) TOPOLOGY: linear                                                  - -     (ii) MOLECULE TYPE: DNA                                                - -     (ix) FEATURE:                                                                   (A) NAME/KEY: CDS                                                              (B) LOCATION: 1 .. 
675

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

GGA AGT GGT CTT CCA TTA GGC ATA GGC AGC AGT GGT CAC CCT CTG TTT       48
Gly Ser Gly Leu Pro Leu Gly Ile Gly Ser Ser Gly His Pro Leu Phe
1               5                   10                  15

AAC AAG GTA AAT GAT ACA GAA AAT GGC AAC ACA TAT AAA GGG ACA ACT       96
Asn Lys Val Asn Asp Thr Glu Asn Gly Asn Thr Tyr Lys Gly Thr Thr
            20                  25                  30

AAA GAT GAT AGA CAA AAC ATT TCA TTT GAT CCT AAA CAA TTA CAG ATG      144
Lys Asp Asp Arg Gln Asn Ile Ser Phe Asp Pro Lys Gln Leu Gln Met
        35                  40                  45

TTT ATA ATT GGC TGT ACA CCA TGT ATT GGT GAA CAT TGG GAT AAG GCT      192
Phe Ile Ile Gly Cys Thr Pro Cys Ile Gly Glu His Trp Asp Lys Ala
    50                  55                  60

CCT GCA TGT GTT AAT GAT ATT CAA CAA GGT AGT TGC CCA CCA ATA GAA      240
Pro Ala Cys Val Asn Asp Ile Gln Gln Gly Ser Cys Pro Pro Ile Glu
65                  70                  75                  80

TTA GTT AAC ACA TAC ATA CAG GGT GGA GAT ATG GCT GAT ATA GGA TAT      288
Leu Val Asn Thr Tyr Ile Gln Gly Gly Asp Met Ala Asp Ile Gly Tyr
                85                  90                  95

GGC AAT CTA AAT TTT AAA GCT TTA CAG CAA AAT AGA TCA GAT GTT AGC      336
Gly Asn Leu Asn Phe Lys Ala Leu Gln Gln Asn Arg Ser Asp Val Ser
            100                 105                 110

TTG GAT ATT GTA GAT GAA ATA TGC AAA TAT CCT GAC TTT TTA CGA ATG      384
Leu Asp Ile Val Asp Glu Ile Cys Lys Tyr Pro Asp Phe Leu Arg Met
        115                 120                 125

CAA AAT GAT GTA TAT GGC GAT GCC TGT TTT TTT TAT GCT CGA CGG GAG      432
Gln Asn Asp Val Tyr Gly Asp Ala Cys Phe Phe Tyr Ala Arg Arg Glu
    130                 135                 140

CAA TGT TAT GCC AGG CAC TTT TTT GTG CGT GGT GGC AAA CCT GGT GAT      480
Gln Cys Tyr Ala Arg His Phe Phe Val Arg Gly Gly Lys Pro Gly Asp
145                 150                 155                 160

GAT ATA CCT GGT GCC CAA ATT GAT GCA GGG TCA CAT AAA AAT GAA TAT      528
Asp Ile Pro Gly Ala Gln Ile Asp Ala Gly Ser His Lys Asn Glu Tyr
                165                 170                 175

TAC ATA CAG GCA GCT TCA GAC CAA TCA CAA AAT AGT TTG GGG AAT TCT      576
Tyr Ile Gln Ala Ala Ser Asp Gln Ser Gln Asn Ser Leu Gly Asn Ser
            180                 185                 190

ATG TAT TTC CCA ACT ATC AGT GGC TCA TTA GTT TCA AGT GAT GCT CAA      624
Met Tyr Phe Pro Thr Ile Ser Gly Ser Leu Val Ser Ser Asp Ala Gln
        195                 200                 205

TTA TTT AAT AGG CCC TTC TGG CTA CAG CGA GCA CAA GGC CAA AAC AAC      672
Leu Phe Asn Arg Pro Phe Trp Leu Gln Arg Ala Gln Gly Gln Asn Asn
    210                 215                 220

GGG AT                                                               677
Gly
225

(2) INFORMATION FOR SEQ ID NO: 5:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 674 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA

    (ix) FEATURE:
          (A) NAME/KEY: CDS
          (B) LOCATION: 1 .. 
672

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

TCA AGG GGA CAG CCA TTG GGT GTA GGA ACA TCA GGT CAT CCT TTA TTT       48
Ser Arg Gly Gln Pro Leu Gly Val Gly Thr Ser Gly His Pro Leu Phe
1               5                   10                  15

AAC AAA GTC AGG GAT ACT GAA AAC TCA GGT AAC TAT CAA GCA GTT TCT       96
Asn Lys Val Arg Asp Thr Glu Asn Ser Gly Asn Tyr Gln Ala Val Ser
            20                  25                  30

CAG GAT GAC AGA CAA AAT ACA TCT TTT GAT CCT AAA CAA GTG CAA ATG      144
Gln Asp Asp Arg Gln Asn Thr Ser Phe Asp Pro Lys Gln Val Gln Met
        35                  40                  45

TTT GTC ATT GGC TGT GTG CCG TGT ATG GGT GAA CAT TGG GAC AAA GCT      192
Phe Val Ile Gly Cys Val Pro Cys Met Gly Glu His Trp Asp Lys Ala
    50                  55                  60

AAG GTT TGT GAA TCA GAA GCA AAT AAT CAA CAA GGC TTA TGT CCA CCC      240
Lys Val Cys Glu Ser Glu Ala Asn Asn Gln Gln Gly Leu Cys Pro Pro
65                  70                  75                  80

ATA GAG TTA AAA AAT TCA GTA ATT GAA GAT GGA GAT ATG TTT GAT ATA      288
Ile Glu Leu Lys Asn Ser Val Ile Glu Asp Gly Asp Met Phe Asp Ile
                85                  90                  95

GGC TTT GGA AAT ATT AAT AAC AAA GCA CTA TCT TAT AAC AAG TCA GAT      336
Gly Phe Gly Asn Ile Asn Asn Lys Ala Leu Ser Tyr Asn Lys Ser Asp
            100                 105                 110

GTT AGT TTA GAT ATA GTT AAT GAA GTG TGC AAA TAT CCA GAC TTT TTA      384
Val Ser Leu Asp Ile Val Asn Glu Val Cys Lys Tyr Pro Asp Phe Leu
        115                 120                 125

ACC ATG GCT AAT GAT GTG TAT GGA GAT GCT TGT TTT TTC TTT GCT AGA      432
Thr Met Ala Asn Asp Val Tyr Gly Asp Ala Cys Phe Phe Phe Ala Arg
    130                 135                 140

CGA GAA CAA TGT TAT GCC AGA CAT TAT TTT GTT AGG GGA GGC AAT GTT      480
Arg Glu Gln Cys Tyr Ala Arg His Tyr Phe Val Arg Gly Gly Asn Val
145                 150                 155                 160

GGC GAT GCA ATC CCT GAT GGA GCA GTA CAA CAG GAT CAC AAC TAT TAT      528
Gly Asp Ala Ile Pro Asp Gly Ala Val Gln Gln Asp His Asn Tyr Tyr
                165                 170                 175

TTA CCT GCA CAA AAT GCA CAG CAA CAA CAC ACC TTG GGA AAT TCT ATA      576
Leu Pro Ala Gln Asn Ala Gln Gln Gln His Thr Leu Gly Asn Ser Ile
            180                 185                 190

TAT TAT CCA ACT GTT AGT GGG TCT CTT GTA ACA TCT GAT GCT CAG TTA      624
Tyr Tyr Pro Thr Val Ser Gly Ser Leu Val Thr Ser Asp Ala Gln Leu
        195                 200                 205

TTT AAT AGA CCA TTT TGG TTA CAA CGT GCT CAA GGA CAA AAC AAC GGT      672
Phe Asn Arg Pro Phe Trp Leu Gln Arg Ala Gln Gly Gln Asn Asn Gly
    210                 215                 220

AT                                                                   674

(2) INFORMATION FOR SEQ ID NO: 6:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 662 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA

    (ix) FEATURE:
          (A) NAME/KEY: CDS
          (B) LOCATION: 1 .. 
660

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

GGT AGT GGG CAA CCA TTA GGT GTA GGC ACC ACA GGA CAT CCA CTG TTT       48
Gly Ser Gly Gln Pro Leu Gly Val Gly Thr Thr Gly His Pro Leu Phe
1               5                   10                  15

AAT AAA CTT AGA GAT TCA GAA AAT TCT GCA GAA CGT CTG GAA GGA ACA       96
Asn Lys Leu Arg Asp Ser Glu Asn Ser Ala Glu Arg Leu Glu Gly Thr
            20                  25                  30

AGT GAT GAT AGG AGG AAT ATA TCA TTT GAT CCT AAG CAA GTG CAA ATG      144
Ser Asp Asp Arg Arg Asn Ile Ser Phe Asp Pro Lys Gln Val Gln Met
        35                  40                  45

TTT GTG ATA GGC TGC ACC CCC TGT TTA GGG GAG TAT TGG GAT ACA GCT      192
Phe Val Ile Gly Cys Thr Pro Cys Leu Gly Glu Tyr Trp Asp Thr Ala
    50                  55                  60

CCA GTA TGT AAA GAT GCA GGA AGT CAA TTA GGC CTT TGC CCT CCA TTA      240
Pro Val Cys Lys Asp Ala Gly Ser Gln Leu Gly Leu Cys Pro Pro Leu
65                  70                  75                  80

GAA TTA AAA AAC AGT GTT ATA GAA GAT GGC GAT ATG TTT GAT ATA GGA      288
Glu Leu Lys Asn Ser Val Ile Glu Asp Gly Asp Met Phe Asp Ile Gly
                85                  90                  95

TTT GGC AAT ATT AAC AAC AAA ACA TTA AGT TTT AAT AAG TCA GAT GTT      336
Phe Gly Asn Ile Asn Asn Lys Thr Leu Ser Phe Asn Lys Ser Asp Val
            100                 105                 110

AGT GTG GAC ATT GTT AAT GAA ATT TGT AAA TAT CCT GAT TTT TTA ACT      384
Ser Val Asp Ile Val Asn Glu Ile Cys Lys Tyr Pro Asp Phe Leu Thr
        115                 120                 125

ATG TCC AAT GAT GTT TAT GGA GAC TCT TGC TTT TTC TTT GCT CGC AGA      432
Met Ser Asn Asp Val Tyr Gly Asp Ser Cys Phe Phe Phe Ala Arg Arg
    130                 135                 140

GAG CGA TGT TAT GCA AGG CAT TAT TTT GTA CGC GGA GGG GCA GTG GGT      480
Glu Arg Cys Tyr Ala Arg His Tyr Phe Val Arg Gly Gly Ala Val Gly
145                 150                 155                 160

GAT TTA ATA CCA GAT GCT ACA GTT AAT CAG GAC CAT AAA TAT TAC TTA      528
Asp Leu Ile Pro Asp Ala Thr Val Asn Gln Asp His Lys Tyr Tyr Leu
                165                 170                 175

CCA GCA AAT CCA CCT GCC ACA TTG GAA AAC TCT ACA TAC TTT CCG ACT      576
Pro Ala Asn Pro Pro Ala Thr Leu Glu Asn Ser Thr Tyr Phe Pro Thr
            180                 185                 190

GCT AGT GGC TCC TTA GTG ACA TCT GAT GCA CAA TTA TTT AAT AGG CCC      624
Ala Ser Gly Ser Leu Val Thr Ser Asp Ala Gln Leu Phe Asn Arg Pro
        195                 200                 205

TTT TGG TTA AAA CGT GCA CAA GGT CAT AAT AAT GGT AT                   662
Phe Trp Leu Lys Arg Ala Gln Gly His Asn Asn Gly
    210                 215                 220

(2) INFORMATION FOR SEQ ID NO: 7:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 665 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA

    (ix) FEATURE:
          (A) NAME/KEY: CDS
          (B) LOCATION: 1 .. 
663

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GGT AGG GGG CAA CCA TTT GGG GTA GGC ACT ACA GGT CAT CCA TTA TTT       48
Gly Arg Gly Gln Pro Phe Gly Val Gly Thr Thr Gly His Pro Leu Phe
1               5                   10                  15

AAC AAA TTA CGT GAT GCA GAA AAT TCC AGC GAA CGT CAG GGA GAT ACT       96
Asn Lys Leu Arg Asp Ala Glu Asn Ser Ser Glu Arg Gln Gly Asp Thr
            20                  25                  30

GCT GCA GAT GAC AGA ATG AAT ATA TCT TTT GAT CCT AAG CAG GTA CAA      144
Ala Ala Asp Asp Arg Met Asn Ile Ser Phe Asp Pro Lys Gln Val Gln
        35                  40                  45

ATG TTC ATA ATA GGT TGC ACA CCG TGT TTA GGT GAA TAT TGG GAT CAA      192
Met Phe Ile Ile Gly Cys Thr Pro Cys Leu Gly Glu Tyr Trp Asp Gln
    50                  55                  60

GCG CCT GTA TGT AAA GAT GCA GGT AAC CAA ATG GGC TTA TGT CCT CCT      240
Ala Pro Val Cys Lys Asp Ala Ser Asn Gln Met Gly Leu Cys Pro Pro
65                  70                  75                  80

CTT GAA CTA AAG AAT AGT GTC ATA GAA GAT GGA GAT ATG TTT GAT ATA      288
Leu Glu Leu Lys Asn Ser Val Ile Glu Asp Gly Asp Met Phe Asp Ile
                85                  90                  95

GGC TTT GGT AAC ATT AAT AAT AAG ACA CTG TCA TTC AAT AGA TCA GAT      336
Gly Phe Gly Asn Ile Asn Asn Lys Thr Leu Ser Phe Asn Arg Ser Asp
            100                 105                 110

GTT AGT TTA GAT ATT GTA AAT GAA ATA TGC AAA TAT CCA GAT TTT TTA      384
Val Ser Leu Asp Ile Val Asn Glu Ile Cys Lys Tyr Pro Asp Phe Leu
        115                 120                 125

ACA ATG TCC AAT GAT GTT TAT GGT GAC TCC TGT TTT TTT TGT GCT CGA      432
Thr Met Ser Asn Asp Val Tyr Gly Asp Ser Cys Phe Phe Cys Ala Arg
    130                 135                 140

AGA GAG CAA TGT TAT GCT AGA CAT TAT TTT GTA CGA GGC GGT GTT GTT      480
Arg Glu Gln Cys Tyr Ala Arg His Tyr Phe Val Arg Gly Gly Val Val
145                 150                 155                 160

GGA GAT TCT ATA CCA GAC GGT GCA GTC CAG CAG AGT AAC AAA TAT TAT      528
Gly Asp Ser Ile Pro Asp Gly Ala Val Gln Gln Ser Asn Lys Tyr Tyr
                165                 170                 175

TTA GCT TCA GCT CAA AAT AAT AGC TTG GAA AAT TCT ACC TAT TTC CCA      576
Leu Ala Ser Ala Gln Asn Asn Ser Leu Glu Asn Ser Thr Tyr Phe Pro
            180                 185                 190

ACT GTA AGT GGT TCT TTA GTG ACT TCT GAT GCT CAG CTA TTT AAC AGA      624
Thr Val Ser Gly Ser Leu Val Thr Ser Asp Ala Gln Leu Phe Asn Arg
        195                 200                 205

CCC TTT TGG TTA AAG CGT GCT CAA GGG CAT AAT AAT GGA AT               665
Pro Phe Trp Leu Lys Arg Ala Gln Gly His Asn Asn Gly
    210                 215                 220

(2) INFORMATION FOR SEQ ID NO: 8:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 674 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA

    (ix) FEATURE:
          (A) NAME/KEY: CDS
          (B) LOCATION: 1 .. 
672

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GGA AGA GGT CTC CAT TTG GGT GTA GGT ACA GCA GGC CAT CCA CTA TTC       48
Gly Arg Gly Leu His Leu Gly Val Gly Thr Ala Gly His Pro Leu Phe
1               5                   10                  15

AAT AAA GTT AGA GAT ACA GAA AAT AAT AGT GGC TAT CAA GAT ACG TCT       96
Asn Lys Val Arg Asp Thr Glu Asn Asn Ser Gly Tyr Gln Asp Thr Ser
            20                  25                  30

ACG GAT GAC AGA CAA AAT ACA TCA TTT GAT CCA AAA CAA GTT CAA ATG      144
Thr Asp Asp Arg Gln Asn Thr Ser Phe Asp Pro Lys Gln Val Gln Met
        35                  40                  45

TTT GTA GTA GGA TGT GCT CCT TGT TTG GGA GAA CAT TGG GAT AAA GCT      192
Phe Val Val Gly Cys Ala Pro Cys Leu Gly Glu His Trp Asp Lys Ala
    50                  55                  60

CCT GTC TGT GAC TCA GAT AAA AAT AAC CAG GCT GGA AAA TGC CCT CCA      240
Pro Val Cys Asp Ser Asp Lys Asn Asn Gln Ala Gly Lys Cys Pro Pro
65                  70                  75                  80

TTA GAA CTG AGA AAC ACA GTA ATA GAA GAT GGA GAT ATG ATT GAT ATA      288
Leu Glu Leu Arg Asn Thr Val Ile Glu Asp Gly Asp Met Ile Asp Ile
                85                  90                  95

GGC TTT GGC AAT ATA AAC AAC AAG GTT TTA TCA GTT ACT AAG TCA GAT      336
Gly Phe Gly Asn Ile Asn Asn Lys Val Leu Ser Val Thr Lys Ser Asp
            100                 105                 110

GTT AGT CTG GAT ATA GTT AAT GAA ACT TGT AAG TAT CCA GAT TTT TTA      384
Val Ser Leu Asp Ile Val Asn Glu Thr Cys Lys Tyr Pro Asp Phe Leu
        115                 120                 125

ACT ATG GCC AAT GAT GTA TAT GGT GAC TCT TGT TTT TTC TTT GCA AGG      432
Thr Met Ala Asn Asp Val Tyr Gly Asp Ser Cys Phe Phe Phe Ala Arg
    130                 135                 140

AGA GAA CAG TGT TAT GCT AGA CAT TAT TAT GTT AGG GGA GGT GTA GTA      480
Arg Glu Gln Cys Tyr Ala Arg His Tyr Tyr Val Arg Gly Gly Val Val
145                 150                 155                 160

GGT GAT GCT ATT CCT GAT GAA GCT GTG AAT CAA GAT AAA AAC TTT GTG      528
Gly Asp Ala Ile Pro Asp Glu Ala Val Asn Gln Asp Lys Asn Phe Val
                165                 170                 175

TTA CCT GCA CAA GGC ACT CAG CAA CAA AAG GAT ATA GCT AGT TCT ATA      576
Leu Pro Ala Gln Gly Thr Gln Gln Gln Lys Asp Ile Ala Ser Ser Ile
            180                 185                 190

TAT TTT CCA ACT GTT AGT GGT TCC TTA GTA ACT TCT GAT GCT CAA TTA      624
Tyr Phe Pro Thr Val Ser Gly Ser Leu Val Thr Ser Asp Ala Gln Leu
        195                 200                 205

TTT AAC AGA CCA TTT TGG TTA CGC AGA GCA CAA GGG CAA AAT AAC GGG      672
Phe Asn Arg Pro Phe Trp Leu Arg Arg Ala Gln Gly Gln Asn Asn Gly
    210                 215                 220

AT                                                                   674

(2) INFORMATION FOR SEQ ID NO: 9:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 686 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA

    (ix) FEATURE:
          (A) NAME/KEY: CDS
          (B) LOCATION: 1 .. 
684

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

GGG AGA GGA CAG CCA TTA GGC GTT GGT ACC AGT GGA CAT CCA CTG TTT       48
Gly Arg Gly Gln Pro Leu Gly Val Gly Thr Ser Gly His Pro Leu Phe
1               5                   10                  15

AAC AAA GTT AAT GAT GCC GAA AAT CCC TTA GCT TAC AGG GCA CAG GCC       96
Asn Lys Val Asn Asp Ala Glu Asn Pro Leu Ala Tyr Arg Ala Gln Ala
            20                  25                  30

TTT TCT ACT GAT GAT AGG CAA AAC ACA TCC TTT GAT CCT AAA CAA ATA      144
Phe Ser Thr Asp Asp Arg Gln Asn Thr Ser Phe Asp Pro Lys Gln Ile
        35                  40                  45

CAA ATG TTT ATA ATA GGT TGT GCA CCC TGT ATT GGA GAG CAT TGG GAT      192
Gln Met Phe Ile Ile Gly Cys Ala Pro Cys Ile Gly Glu His Trp Asp
    50                  55                  60

GTA GGT GAA CGT TGT GCA GGA GCC AAT AAT GAA AAT GGT CGA TGC CCC      240
Val Gly Glu Arg Cys Ala Gly Ala Asn Asn Glu Asn Gly Arg Cys Pro
65                  70                  75                  80

CCT ATT AAA TTG GTA AAT TCA GTC ATC CAA GAT GGA GAT ATG GCA GAT      288
Pro Ile Lys Leu Val Asn Ser Val Ile Gln Asp Gly Asp Met Ala Asp
                85                  90                  95

ATT GGT TAT GGA AAC CTA AAT TTC CGT ACC TTA CAG GAA AAC AGA TCT      336
Ile Gly Tyr Gly Asn Leu Asn Phe Arg Thr Leu Gln Glu Asn Arg Ser
            100                 105                 110

GAT GTA AGT TTA GAT ATA GTG AAT GAA ACC TGT AAA TAT CCA GAC TTT      384
Asp Val Ser Leu Asp Ile Val Asn Glu Thr Cys Lys Tyr Pro Asp Phe
        115                 120                 125

TTA AAG ATG CAG AAT GAT ATA TAT GGC GAT TCT TGC TTT TTC TTT GCT      432
Leu Lys Met Gln Asn Asp Ile Tyr Gly Asp Ser Cys Phe Phe Phe Ala
    130                 135                 140

CGC CGG GAG CAA TGT TAT GCA AGA CAT TTT TTT GTT CGT GGG GGT AAG      480
Arg Arg Glu Gln Cys Tyr Ala Arg His Phe Phe Val Arg Gly Gly Lys
145                 150                 155                 160

GCG GGG GAT GAC ATT CCT GGT GCG CAA ATC GAT GCA GGT ACA TAT AAA      528
Ala Gly Asp Asp Ile Pro Gly Ala Gln Ile Asp Ala Gly Thr Tyr Lys
                165                 170                 175

AAT GAT TTT TAC ATA CCT GGA GCG TCA GGT CAG ACA CAA AAG AAT ATA      576
Asn Asp Phe Tyr Ile Pro Gly Ala Ser Gly Gln Thr Gln Lys Asn Ile
            180                 185                 190
 - - GGT AAC TCG ATG TAT TTC CCA ACA GTA AGT GG - #C TCA TTG GTG TCT AGT           624                                                                        Gly Asn Ser Met Tyr Phe Pro Tyr Val Ser Gl - #y Ser Leu Val Ser Ser                    195          - #       200          - #       205                       - - GAT GCT CAA TTG TTT AAT AGG CCC TTC TGG CT - #C CAA CGG GCG CAG GGG           672                                                                        Asp Ala Gln Leu Phe Asn Arg Pro Phe Trp Le - #u Gln Arg Ala Gln Gly                210              - #   215              - #   220                           - - CAA AAC AAC GGA AT           - #                  - #                       - #    686                                                                   Gln Asn Asn Gly                                                                225                                                                           __________________________________________________________________________ 
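
As a rough illustration of how the paired nucleotide and amino acid lines in the sequence listing relate to each other, the following Python sketch translates a stretch of DNA codon by codon into three-letter residues. It assumes the standard genetic code; the names `CODON_TABLE`, `THREE_LETTER` and `translate` are hypothetical helpers introduced here for illustration and are not part of the disclosure.

```python
from itertools import product

# Standard genetic code (assumption), codons enumerated with bases in the
# order T, C, A, G: TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue names; "*" marks a stop codon.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna: str) -> str:
    """Translate DNA (whitespace allowed) into space-separated residues.

    A trailing incomplete codon, such as the final "AT" of the listing,
    is ignored.
    """
    seq = "".join(dna.split()).upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    return " ".join(THREE_LETTER[CODON_TABLE[c]] for c in codons)


# First row of SEQ ID NO: 9 (nucleotides 1 to 48).
first_row = "GGG AGA GGA CAG CCA TTA GGC GTT GGT ACC AGT GGA CAT CCA CTG TTT"
print(translate(first_row))
# Gly Arg Gly Gln Pro Leu Gly Val Gly Thr Ser Gly His Pro Leu Phe
```

Applied to the last row of the listing, `translate("CAA AAC AAC GGA AT")` yields `Gln Asn Asn Gly`, since the two trailing nucleotides do not form a complete codon.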

What is claimed is:
 1. An isolated DNA encoding a peptide of an L1 papilloma virus major capsid protein, wherein said peptide comprises the amino acid sequence of FIG. 2 (SEQ ID NO: 2), FIG. 3 (SEQ ID NO: 3), FIG. 4 (SEQ ID NO: 4), FIG. 5 (SEQ ID NO: 5), FIG. 6 (SEQ ID NO: 6), FIG. 7 (SEQ ID NO: 7), FIG. 8 (SEQ ID NO: 8), or FIG. 9 (SEQ ID NO: 9).
 2. An isolated DNA encoding an L1 papilloma virus major capsid protein, wherein said L1 polypeptide comprises the amino acid sequence of FIG. 1 (SEQ ID NO: 1), FIG. 2 (SEQ ID NO: 2), FIG. 3 (SEQ ID NO: 3), FIG. 4 (SEQ ID NO: 4), FIG. 5 (SEQ ID NO: 5), FIG. 6 (SEQ ID NO: 6), FIG. 7 (SEQ ID NO: 7), FIG. 8 (SEQ ID NO: 8), or FIG. 9 (SEQ ID NO: 9).
 3. The isolated DNA of claim 1, wherein said DNA comprises the base sequence of FIG. 2 (SEQ ID NO: 2), FIG. 3 (SEQ ID NO: 3), FIG. 4 (SEQ ID NO: 4), FIG. 5 (SEQ ID NO: 5), FIG. 6 (SEQ ID NO: 6), FIG. 7 (SEQ ID NO: 7), FIG. 8 (SEQ ID NO: 8), or FIG. 9 (SEQ ID NO: 9), or the complement thereof.
 4. The isolated DNA of claim 2, wherein the DNA encoding said L1 papilloma virus major capsid protein comprises the base sequence of FIG. 1 (SEQ ID NO: 1), FIG. 2 (SEQ ID NO: 2), FIG. 3 (SEQ ID NO: 3), FIG. 4 (SEQ ID NO: 4), FIG. 5 (SEQ ID NO: 5), FIG. 6 (SEQ ID NO: 6), FIG. 7 (SEQ ID NO: 7), FIG. 8 (SEQ ID NO: 8), or FIG. 9 (SEQ ID NO: 9).
 5. A composition comprising the DNA of claim 1, 2, 3, or 4 as a reagent for diagnosis.
 6. An expression vector, comprising the DNA of claim 1, 2, 3, or 4.
 7. A transformant, comprising the expression vector of claim 6.
 8. A protein encoded by the DNA of claim 1, 2, 3, or 4.
 9. The protein of claim 8, wherein the protein is an L1 papilloma virus major capsid protein.
 10. The protein of claim 8, wherein the protein is a papilloma virus minor capsid protein.
 11. A virus-like particle, comprising the L1 papilloma virus major capsid protein of claim 9.
 12. The virus-like particle of claim 11, additionally comprising a papilloma virus minor capsid protein.
 13. A composition comprising the protein of claim 8 as a reagent for diagnosis, treatment and/or vaccination.
 14. A composition comprising the virus-like particle of claim 11 as a reagent for diagnosis, treatment and/or vaccination.
 15. A pharmaceutical composition comprising the protein of claim 8, and a pharmaceutically acceptable carrier.
 16. A composition comprising the virus-like particle of claim 12 as a reagent for diagnosis, treatment and/or vaccination.
 17. A pharmaceutical composition comprising the virus-like particle of claim 12, and a pharmaceutically acceptable carrier.
 18. An antibody, directed against the protein of claim 8.
 19. An antibody, directed against the virus-like particle of claim 11.
 20. A pharmaceutical composition comprising the antibody of claim 18, and a pharmaceutically acceptable carrier.
 21. A composition comprising the antibody of claim 18 as a reagent for diagnosis and/or treatment.
 22. A composition comprising the antibody of claim 19 as a reagent for diagnosis and/or treatment.
 23. A pharmaceutical composition comprising the antibody of claim 19, and a pharmaceutically acceptable carrier.
 24. A method of producing a DNA sequence comprising a nucleotide sequence encoding an L1 papilloma virus major capsid protein, comprising: (a) isolating the total DNA from a biopsy of epithelial neoplasm, (b) hybridizing under stringent conditions the total DNA of (a) with a DNA of claim 1, 2, 3, or 4, thereby detecting a papilloma virus genome included in the total DNA of (a), and (c) cloning the total DNA of (a), including the papilloma virus genome, in a vector and optionally subcloning the resulting clone, all of the steps originating from the conventional DNA recombination technique.
 25. A method of producing the protein of claim 8, comprising the cultivation of the transformant containing an expression vector encoding said protein under suitable conditions.
 26. A method of detecting a papilloma virus DNA, comprising: (a) hybridizing under stringent conditions the DNA of claim 1, 2, 3, or 4 to a DNA sample; and (b) identifying papilloma virus in said DNA sample by detecting a hybridization signal.
 27. A pharmaceutical composition comprising the virus-like particle of claim 11, and a pharmaceutically acceptable carrier. 